Posted 2025-04-23 | Updated 2026-02-15 | 6 minutes read (about 870 words)

FCSGG Repo Explanation

FCSGG Repository Summary

FCSGG (Fully Convolutional Scene Graph Generation) is a PyTorch implementation of the paper “Fully Convolutional Scene Graph Generation”, published at CVPR 2021. It targets scene graph generation: detecting the objects in an image and identifying the relationships between them.

Core Components:

Architecture:
- Built on Detectron2, a popular object detection framework by Facebook
- Uses a one-stage detector approach (CenterNet) as the meta-architecture
- Supports various backbones including ResNet, HRNet (High-Resolution Network), Hourglass networks, and DLA

Key Features:
- Fully convolutional approach to scene graph generation
- Multiple backbone options with different feature pyramid networks (FPN, BiFPN, HRFPN)
- Various head designs including multiscale heads and attention mechanisms
- Support for different input resolutions and training strategies

Dataset:
- Primarily designed for the Visual Genome dataset, a large-scale dataset for scene understanding
- Includes custom data loaders and preprocessing for scene graph generation

Model Components:
- Backbones: Various CNN architectures (ResNet, HRNet, Hourglass, DLA)
- Necks: Feature pyramid networks and variants (FPN, BiFPN, HRFPN, Trident)
- Heads: Detection and relationship prediction heads
- Loss Functions: Custom losses for object detection and relationship prediction

Utilities:
- Visualization tools for scene graphs
- Evaluation metrics for scene graph generation
- Training and inference scripts

Project Structure:

fcsgg/: Main module containing model implementation
- modeling/: Neural network architecture components
  - backbone/: Feature extraction networks
  - necks/: Feature pyramid networks
  - heads/: Detection and relationship prediction heads
  - meta_arch/: High-level model architecture (CenterNet)
- data/: Dataset handling and preprocessing
- evaluation/: Metrics and evaluation code
- utils/: Helper functions and utilities
- layers/: Custom neural network layers
- structures/: Data structures for scene graphs
configs/: Configuration files for different model variants
tools/: Training, evaluation, and visualization scripts
GraphViz/: Visualization tools for scene graphs

Key Innovations:

The project implements a fully convolutional approach to scene graph generation, which differs from traditional two-stage methods. Instead of first detecting objects and then predicting relationships, it uses a one-stage detector to simultaneously predict objects and their relationships in a fully convolutional manner.
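As a rough illustration of the one-stage idea, the sketch below picks object centers as local maxima of a heatmap, which is how CenterNet-style detectors typically decode their output. It is a minimal pure-Python sketch and not code from the repository: find_center_peaks, the 3x3 neighbourhood, the threshold value, and the toy heatmap are all assumptions made for illustration, and relationship prediction is left out entirely.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def find_center_peaks(
    heatmap: Heatmap, threshold: float = 0.3
) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for cells that are 3x3 local maxima.

    Hypothetical helper: a cell counts as a peak when its score reaches the
    threshold and no cell in its 3x3 neighbourhood scores higher.
    """
    height, width = len(heatmap), len(heatmap[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            score = heatmap[r][c]
            if score < threshold:
                continue
            # Collect the neighbourhood, clipped at the image border.
            neighbours = [
                heatmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            if score >= max(neighbours):
                peaks.append((r, c, score))
    # Highest-scoring centers first.
    return sorted(peaks, key=lambda p: p[2], reverse=True)


# Toy single-class heatmap with two clear centers (made-up values).
toy_heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.1, 0.2],
]
peaks = find_center_peaks(toy_heatmap)
```

In a real model the same step would usually run on tensors (for example with a max-pooling operation) and be followed by box-size and relationship decoding; the loop version here only makes the logic explicit.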

Benchmarks:

The repository provides several pre-trained models with different backbones:

HRNetW32-1S
ResNet50-4S-FPN×2
HRNetW48-5S-FPN×2

These models achieve competitive performance on the Visual Genome dataset for scene graph generation tasks.

Usage:

The project provides tools for training, evaluation, and visualization of scene graphs. It requires the Visual Genome dataset and can be run using Docker or directly with PyTorch.

In summary, FCSGG implements a fully convolutional, one-stage approach to scene graph generation and offers a range of model architectures and training configurations.

How Detectron2 is Used in FCSGG

FCSGG is built on top of Detectron2, Facebook’s object detection framework, and leverages many of its components while extending it for scene graph generation. Here’s a detailed breakdown:

1. Core Architecture Integration

Meta Architecture: FCSGG registers a custom meta architecture called “CenterNet” with Detectron2’s META_ARCH_REGISTRY. This extends Detectron2’s modular architecture system while maintaining compatibility.
Backbone Networks: FCSGG uses Detectron2’s backbone networks (ResNet, etc.) directly and also implements custom backbones like HRNet while following Detectron2’s backbone interface.
Feature Pyramid Networks (FPN): The repository uses Detectron2’s FPN implementation and extends it with custom variants like BiFPN and HRFPN.
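The registry mechanism mentioned above can be sketched without Detectron2 installed. The following is a simplified stand-in, assuming a name-to-class lookup similar in spirit to Detectron2's META_ARCH_REGISTRY; the Registry class, the dict-based config, and the keys "meta_architecture" and "num_classes" are hypothetical and only serve the illustration.

```python
from typing import Callable, Dict


class Registry:
    """Minimal name -> class registry (stand-in, not Detectron2's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entries: Dict[str, type] = {}

    def register(self) -> Callable[[type], type]:
        """Return a decorator that stores the class under its own name."""

        def decorator(cls: type) -> type:
            if cls.__name__ in self._entries:
                raise KeyError(f"{cls.__name__} already registered in {self.name}")
            self._entries[cls.__name__] = cls
            return cls

        return decorator

    def get(self, key: str) -> type:
        return self._entries[key]


META_ARCH_REGISTRY = Registry("META_ARCH")


@META_ARCH_REGISTRY.register()
class CenterNet:
    """Placeholder meta-architecture; a real one would be an nn.Module."""

    def __init__(self, cfg: dict) -> None:
        self.num_classes = cfg["num_classes"]  # hypothetical config key


def build_model(cfg: dict):
    """Look the architecture up by name and construct it from the config."""
    return META_ARCH_REGISTRY.get(cfg["meta_architecture"])(cfg)


# Toy config: the architecture is selected purely by its registered name.
model = build_model({"meta_architecture": "CenterNet", "num_classes": 150})
```

The point of the pattern is that a config file can switch architectures by name, so a custom model plugs in without any change to the framework's build code.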

2. Configuration System

YAML Configuration: FCSGG adopts Detectron2’s YAML-based configuration system, extending it with custom configurations for scene graph generation through add_fcsgg_config().
Command Line Arguments: The training script uses Detectron2’s default_argument_parser() to maintain the same command-line interface.
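The "defaults first, overrides second" flow behind add_fcsgg_config() can be sketched with plain dictionaries. This is a stand-in under stated assumptions: Detectron2 actually uses a CfgNode object, and the RELATION section, its NUM_PREDICATES key, and the helper merge_overrides are hypothetical names invented for this example rather than the repository's real options.

```python
from typing import Any, Dict

Config = Dict[str, Any]


def get_base_cfg() -> Config:
    """Toy stand-in for the framework's default config."""
    return {
        "MODEL": {"META_ARCHITECTURE": "GeneralizedRCNN"},
        "SOLVER": {"BASE_LR": 0.001},
    }


def add_fcsgg_config(cfg: Config) -> None:
    """Add project-specific defaults in place (hypothetical keys)."""
    cfg["RELATION"] = {"NUM_PREDICATES": 0}


def merge_overrides(cfg: Config, overrides: Config, prefix: str = "") -> None:
    """Recursively apply overrides, rejecting keys that have no default."""
    for key, value in overrides.items():
        if key not in cfg:
            raise KeyError(f"Non-existent config key: {prefix}{key}")
        if isinstance(value, dict) and isinstance(cfg[key], dict):
            merge_overrides(cfg[key], value, prefix + key + ".")
        else:
            cfg[key] = value


cfg = get_base_cfg()
add_fcsgg_config(cfg)  # must run before merging, or RELATION is unknown
merge_overrides(
    cfg,
    {
        "MODEL": {"META_ARCHITECTURE": "CenterNet"},
        "RELATION": {"NUM_PREDICATES": 50},
    },
)
```

Rejecting unknown keys is the reason the extension function has to run before a YAML file is merged: a custom option without a registered default would otherwise be treated as a typo.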

3. Data Handling

Dataset Registration: Visual Genome dataset is registered with Detectron2’s DatasetCatalog and MetadataCatalog, making it available through Detectron2’s data loading pipeline.
Custom Dataset Mapper: FCSGG implements a custom DatasetMapper class that extends Detectron2’s mapper to handle scene graph annotations.
Data Loaders: The repository uses Detectron2’s build_detection_train_loader and build_detection_test_loader with custom mappers.
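The registration and mapping steps above can be sketched as follows. This is a simplified stand-in rather than the repository's code: register_dataset and get_dataset mimic the idea of a dataset catalog that stores loader functions, and the record layout (boxes as x0, y0, x1, y1 and relations as subject index, predicate id, object index), the name "toy_vg_train", and the SceneGraphMapper class are assumptions for illustration.

```python
import copy
from typing import Callable, Dict, List

_DATASET_LOADERS: Dict[str, Callable[[], List[dict]]] = {}


def register_dataset(name: str, loader: Callable[[], List[dict]]) -> None:
    """Store a loader function under a name; data is loaded only on demand."""
    if name in _DATASET_LOADERS:
        raise ValueError(f"Dataset '{name}' is already registered")
    _DATASET_LOADERS[name] = loader


def get_dataset(name: str) -> List[dict]:
    """Call the registered loader and return its list of records."""
    return _DATASET_LOADERS[name]()


def load_toy_scene_graphs() -> List[dict]:
    # Hypothetical record layout, one dict per image.
    return [{
        "file_name": "toy.jpg",
        "annotations": [
            {"bbox": [10, 20, 50, 80], "category_id": 0},
            {"bbox": [40, 60, 120, 140], "category_id": 1},
        ],
        "relations": [(0, 3, 1)],  # (subject index, predicate id, object index)
    }]


class SceneGraphMapper:
    """Callable that turns one dataset record into training targets."""

    def __call__(self, record: dict) -> dict:
        record = copy.deepcopy(record)  # never mutate the source record
        annotations = record.pop("annotations")
        # Center-based detectors train on box centers, so derive them here.
        record["centers"] = [
            ((a["bbox"][0] + a["bbox"][2]) / 2, (a["bbox"][1] + a["bbox"][3]) / 2)
            for a in annotations
        ]
        record["classes"] = [a["category_id"] for a in annotations]
        return record


register_dataset("toy_vg_train", load_toy_scene_graphs)
mapper = SceneGraphMapper()
samples = [mapper(record) for record in get_dataset("toy_vg_train")]
```

A real mapper would also read the image, apply augmentations, and keep the boxes in sync with them; the sketch only shows where the relationship annotations travel through the pipeline.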

4. Training and Evaluation

Trainer Class: FCSGG extends Detectron2’s DefaultTrainer class to customize the training loop, evaluation metrics, and data loading.
Checkpointing: The repository uses Detectron2’s DetectionCheckpointer for model saving and loading.
Distributed Training: FCSGG leverages Detectron2’s distributed training utilities through detectron2.utils.comm and the launch function.
Custom Evaluators: The repository implements a custom VGEvaluator for scene graph evaluation while following Detectron2’s evaluator interface.
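To make the evaluation step more concrete, here is a minimal sketch of Recall@K, a metric commonly used for scene graph generation. It is not the repository's VGEvaluator: recall_at_k and the example triplets are assumptions for illustration, and the sketch matches triplets by label only, whereas a full evaluation typically also requires the predicted boxes to overlap the ground-truth boxes.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def recall_at_k(
    predicted: List[Tuple[Triplet, float]],
    ground_truth: List[Triplet],
    k: int,
) -> float:
    """Fraction of ground-truth triplets found among the top-k predictions."""
    unique_gt = set(ground_truth)
    if not unique_gt:
        return 0.0
    # Rank predictions by confidence and keep the k best triplets.
    ranked = sorted(predicted, key=lambda item: item[1], reverse=True)
    top_k = {triplet for triplet, _score in ranked[:k]}
    hits = sum(1 for triplet in unique_gt if triplet in top_k)
    return hits / len(unique_gt)


# Made-up example for one image.
gt_triplets = [("man", "riding", "horse"), ("horse", "on", "grass")]
predictions = [
    (("man", "riding", "horse"), 0.9),
    (("man", "near", "horse"), 0.7),
    (("horse", "on", "grass"), 0.4),
]
recall_2 = recall_at_k(predictions, gt_triplets, k=2)
recall_3 = recall_at_k(predictions, gt_triplets, k=3)
```

With k=2 only one of the two ground-truth triplets is ranked high enough, so recall is 0.5; widening to k=3 recovers both.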

5. Visualization and Logging

Event Storage: FCSGG uses Detectron2’s event storage system for logging metrics during training.
Visualization Tools: The repository leverages Detectron2’s visualization utilities for debugging and result analysis.

6. Extensions for Scene Graph Generation

Custom Heads: While using Detectron2’s architecture, FCSGG implements custom prediction heads for relationship detection.
Scene Graph Structures: The repository defines custom data structures for scene graphs that integrate with Detectron2’s Instances class.
Loss Functions: FCSGG implements specialized loss functions for scene graph generation while maintaining compatibility with Detectron2’s loss computation framework.
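A scene graph structure of the kind described above might look like the sketch below. It is a hypothetical design, not the repository's class: the SceneGraph name, its fields, and the triplets() method are assumptions. The only idea borrowed from Detectron2's Instances is that per-object fields must all have the same length.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x0, y0, x1, y1)


@dataclass
class SceneGraph:
    """Objects plus directed, labelled edges between them (hypothetical)."""

    boxes: List[Box]
    classes: List[str]
    # Each relation is (subject index, predicate label, object index).
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Per-object fields must line up one-to-one.
        if len(self.boxes) != len(self.classes):
            raise ValueError("boxes and classes must have the same length")
        for subj, _pred, obj in self.relations:
            for index in (subj, obj):
                if not 0 <= index < len(self.classes):
                    raise IndexError(f"relation refers to unknown object {index}")

    def triplets(self) -> List[Tuple[str, str, str]]:
        """Resolve index-based relations into (subject, predicate, object)."""
        return [
            (self.classes[subj], pred, self.classes[obj])
            for subj, pred, obj in self.relations
        ]


graph = SceneGraph(
    boxes=[(10.0, 20.0, 50.0, 80.0), (40.0, 60.0, 120.0, 140.0)],
    classes=["man", "horse"],
    relations=[(0, "riding", 1)],
)
```

Storing relations as indices into the object list keeps the graph consistent when objects are filtered or reordered, provided the indices are remapped at the same time.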

7. Installation and Dependencies

Submodule Integration: Detectron2 is included as a Git submodule, ensuring version compatibility.
Build Process: The installation process includes building Detectron2 from source to ensure proper integration.

In summary, FCSGG uses Detectron2 as its foundation, leveraging its modular architecture, data handling, training infrastructure, and configuration system while extending it with custom components for scene graph generation. This approach allows FCSGG to benefit from Detectron2’s robust implementation and optimizations while adding specialized functionality for relationship detection between objects.

http://chen-yulin.github.io/2025/04/23/[OBS]科研-FCSGG Repo Explanation/

Author

Chen Yulin

#Python CV
